Allow PermutedDimsArray in gemm_strided_batched #539

Conversation
It serves as documentation too. Looking at the name of the function, one could expect to be able to use it with actual strided inputs (i.e. a non-contiguous view); which the …

Please rebase on latest master to get CI.
It's a pity to lose the documentation aspect, but is there another way? Inserting some huge union isn't going to result in a readable …

Can rebase, but I think CI won't work until NNlib is tagged.
Force-pushed from f1aadc3 to 0692e58
Oh, right. You can always add an explicit dep to the Manifest if you want to run CI before that happens.
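(As an aside, a minimal sketch of that, assuming the Pkg REPL; this records the unreleased branch in the Manifest:)

```
pkg> add NNlib#master
```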
Eh, maybe just add a comment at the end of the line.
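(To illustrate the suggestion, here is a hedged sketch, not the PR's actual code; the function name and comment text are mine. The loosened signature keeps its documentation role via an end-of-line comment:)

```julia
# Illustrative sketch only: hypothetical name, loosened argument types.
function gemm_strided_batched_sketch!(transA::Char, transB::Char, alpha::Number,
        A::AbstractArray{T,3},  # must be strided GPU memory, e.g. a CuArray or a PermutedDimsArray of one
        B::AbstractArray{T,3}, beta::Number,
        C::AbstractArray{T,3}) where {T}
    # the real method forwards to the CUBLAS C wrapper here
    return C
end
```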
This reverts commit 0692e58. Not needed, as CUDA only supports Julia 1.5 and up.
Good point about tests, I've added a few. And put NNlib#master in the Manifest to test on CI. (Documentation still fails, I presume because it's on a different manifest.) This needs something to make pointers, possibly just …
These conversions live in lines 370 to 490 of 1139ffd.
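(As a rough sketch of what such a pointer method could look like; this is my assumption of the approach, not the code that landed. Since a `PermutedDimsArray` shares memory with its parent, its base pointer is just the parent's pointer:)

```julia
using CUDA

# Hedged sketch: forward the pointer conversion to the parent array.
# Assumes the parent of the PermutedDimsArray is a CuArray.
function Base.unsafe_convert(::Type{CuPtr{T}}, A::PermutedDimsArray{T}) where {T}
    Base.unsafe_convert(CuPtr{T}, parent(A))
end
```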
Great, I've moved the pointer as you describe. NNlib is now tagged, and I adjusted the bound in Project.toml. But CI doesn't seem to see v0.7.7; perhaps it needs more time.
Sometimes the PkgServer lags General a bit. I restarted the build, and there's a normal failure in the NNlib tests now.
Codecov Report

```
@@           Coverage Diff            @@
##           master     #539    +/-   ##
========================================
  Coverage   77.82%   77.83%
========================================
  Files         116      116
  Lines        6423     6425      +2
========================================
+ Hits         4999     5001      +2
  Misses       1424     1424
```

Continue to review full report at Codecov.
All green, let's merge this!
Thanks! And sorry about the excessive CI-ing, trying to finish this while away from my desktop maybe wasn't the brightest plan.
No problem, that's what it's there for. But it prompted me to add a smoke test that has to succeed before starting 10 other jobs 😄
This is copied from JuliaGPU/CuArrays.jl#664, which needed FluxML/NNlib.jl#191, now merged (but not yet tagged).
Summarising:
CUBLAS's `gemm_strided_batched!` accepts strides other than the ordered ones of a `CuArray`. This is potentially useful as it saves on performing `permutedims` before operations, and the obvious way to expose this is to let it accept `PermutedDimsArray`s.

I presume that by the time anyone is calling `CuArrays.CUBLAS.gemm_strided_batched!`, they are well aware that the operation is to be done by CUBLAS, and not expecting dispatch to redirect to the CPU if it's an `Array`. So I changed this to accept any `AbstractArray`.
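(A minimal usage sketch of what this enables; the array sizes and values here are my own, assuming CUDA.jl's `CUBLAS.gemm_strided_batched!`:)

```julia
using CUDA

A  = CUDA.rand(Float32, 4, 6, 8)        # batch of 8 matrices
B  = CUDA.rand(Float32, 5, 6, 8)
Bt = PermutedDimsArray(B, (2, 1, 3))    # 6×5×8 view, no copy made
C  = CUDA.zeros(Float32, 4, 5, 8)

# With this PR, the strided view is passed straight to CUBLAS via its
# stride arguments, instead of materialising permutedims(B, (2, 1, 3)).
CUDA.CUBLAS.gemm_strided_batched!('N', 'N', 1f0, A, Bt, 0f0, C)
```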
Then `NNlib.batched_mul` will be a friendly function which dispatches every kind of array to the best routine. I hope I got this sorted out!

To perform e.g. matrix * 3-tensor contractions like the table here, the matrix should be reshaped to another 3-array, with a size-1 dimension to indicate no batch index, or batched matrix-vector, etc.
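(For example, a hedged sketch; the shapes are my own, and it assumes NNlib with this PR's CUDA support:)

```julia
using NNlib, CUDA

A = CUDA.rand(Float32, 4, 6, 8)                                # batch of 8 matrices
B = PermutedDimsArray(CUDA.rand(Float32, 5, 6, 8), (2, 1, 3))  # 6×5×8 view

C = batched_mul(A, B)   # 4×5×8; should reach gemm_strided_batched! without copies

# Reshaping a plain matrix to carry a size-1 batch dimension, as described above:
M  = CUDA.rand(Float32, 4, 4)
M3 = reshape(M, 4, 4, 1)
```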